Locally Optimal Subset Selection Procedures Based on Ranks*

by 

Shanti  S.  Gupta,  Purdue  University 
and 

Deng-Yuan  Huang,  Academia  Sinica,  Taipei,  Taiwan 

In practice, it sometimes happens that the actual values of a random
variable can be observed only at great cost or not at all, while their
ordering is readily observable. This occurs, for instance, in life-testing
when one observes only the order in which the parts under investigation
fail, without being able to record the actual times of failure. Problems
of this type suggest the investigation of decision rules based on ranks.
Although the distributions of rank statistics are usually very involved,
the resulting rules are often simple. Another advantage of rank procedures
is that, under the hypothesis that all distributions are identical, the
distribution of the ranks does not depend on the underlying distribution.
For this reason rank procedures are sometimes referred to as nonparametric
rules. Hajek and Sidak [3] and others have developed an elegant theory of
rank tests. Contributions to some related problems have also been made by
Puri and Sen [6]. However, very little work has been done on multiple
decision problems based on ranks. Gupta and McDonald [1], McDonald [4] and
Nagel [5] have investigated several subset selection rules based on ranks.
Nagel [5] tried to obtain some locally optimal rules based on the rank test
theory.

We are interested in deriving locally optimal ranking and selection
procedures. Although the criterion of optimality differs from Nagel's, the
form of the procedures is the same. It has been shown that procedures of
this type have many good properties.

*This work was supported by the Office of Naval Research Contract
N00014-75-C-0455 at Purdue University.

From each of the k independent populations a fixed number of observations,
say n, is taken. The distribution is assumed to depend on a parameter
$\theta$, and the form of the distribution is also assumed to be known.
Concluding that the $\theta_i$, $i=1,\ldots,k$, are or are not equal may
not be sufficient. Often the experimenter is interested in ascertaining
which population is associated with the largest (or smallest) $\theta$,
which populations possess the t largest (or smallest) $\theta$, etc.
Suppose the experimenter is interested in identifying which one of the k
populations possesses the largest $\theta$, the so-called "best"
population. The subset selection approach to this problem is to select a
small subset which is guaranteed to contain the best population with
probability at least P*, the basic probability requirement in these
procedures. The selection of a subset including the best population is
called a correct selection (CS). In this paper, we are interested in
deriving procedures which satisfy the basic P*-condition and locally
maximize the probability of a correct selection. An example is given to
illustrate the application to a problem in regression analysis.

From each of the populations $\Pi_i$, $i=1,2,\ldots,k$, we take n
observations $X_{i1},\ldots,X_{in}$. Let $R_{ij}$ denote the rank of
$X_{ij}$ in the pooled sample of the $N=kn$ observations
$X_{11},\ldots,X_{1n},X_{21},\ldots,X_{2n},\ldots,X_{k1},\ldots,X_{kn}$.

We  use  the  following  definitions  of  Nagel  [5]: 
Definition 1. A rank configuration is an N-tuple
$\Delta=(\Delta_1,\ldots,\Delta_N)$, $\Delta_i\in\{1,2,\ldots,k\}$,
where $\Delta_i=j$ indicates that the ith smallest observation in the
pooled sample comes from $\Pi_j$, i.e. there exists an $\ell$ such that
$R_{j\ell}=i$ holds.
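Definition 1 can be made concrete with a short sketch. The function name `rank_configuration` and the sample values are illustrative assumptions, not part of the original text; the data are assumed to be free of ties, as they would be for continuous distributions.

```python
def rank_configuration(samples):
    """Return (Delta_1, ..., Delta_N): Delta_i is the 1-based label of the
    population that produced the i-th smallest pooled observation."""
    # Pool all observations, remembering which population each came from.
    pooled = [(x, pop) for pop, obs in enumerate(samples, start=1) for x in obs]
    pooled.sort(key=lambda pair: pair[0])  # assumes no ties
    return tuple(pop for _, pop in pooled)


# Made-up data with k = 3 populations and n = 2 observations each.
samples = [[2.3, 0.7], [1.1, 3.8], [0.2, 5.0]]
delta = rank_configuration(samples)
print(delta)  # (3, 1, 2, 1, 2, 3)
```

Here the smallest pooled observation (0.2) comes from the third population, so the first entry of the configuration is 3, and so on.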

Let $\mathcal{C}=\{\Delta\}$ denote the set of all rank configurations for
the pair k and n, which are kept fixed in these considerations.
$\Delta_x$ denotes the rank configuration of $x=(x_1,\ldots,x_N)$. For a
fixed $\Delta$ let $\mathcal{X}_\Delta=\{x\in\mathcal{X}\mid\Delta_x=\Delta\}$,
where $\mathcal{X}=\{x: x=(x_1,\ldots,x_N)\}$. The decision space
$\mathcal{D}$ consists of the $2^k-1$ nonempty subsets d of the set
$\{1,2,\ldots,k\}$ and the empty set:

$$\mathcal{D}=\{d\mid d\subseteq\{1,2,\ldots,k\}\}.$$

A decision is the selection of a subset of the k populations. The
fact that $i\in d$ indicates that $\Pi_i$ is included in the selected
subset if decision d is made.

Definition 2. A rank selection rule is a measurable function $\delta$
defined on $\mathcal{C}\times\mathcal{D}$ provided that for each
$\Delta\in\mathcal{C}$ (i) $\delta(\Delta,d)\ge 0$ and
(ii) $\sum_{d\in\mathcal{D}}\delta(\Delta,d)=1$ hold.

Let $\delta(\Delta,d)$ denote the probability that the decision d is made
if the rank configuration $\Delta$ is observed.

Definition 3. A subset selection rule R based on ranks is a measurable
mapping from $\mathcal{C}$ into $[0,1]^k$,

$$R:\ \Delta\to(p_1(\Delta),\ldots,p_k(\Delta)),$$

where $p_i(\Delta)=\sum_{d\ni i}\delta(\Delta,d)$ (summation over all
subsets containing i).

If the $p_i$ take on the values 0 and 1 only, then

$$\delta(\Delta,d)=\begin{cases}1 & \text{if } d=\{i\mid p_i(\Delta)=1\},\\ 0 & \text{otherwise},\end{cases}$$

i.e. a non-randomized procedure is completely determined by its
individual selection probabilities. Nagel [5] has shown that
this is not true in general.
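As a minimal sketch of Definitions 2 and 3, the individual selection probabilities can be computed from a rule given as a table over decisions, for one fixed rank configuration. The dictionary representation and the names `inclusion_probabilities` and `nonrandomized_decision` are assumptions made for this illustration; the probabilities are made-up numbers.

```python
def inclusion_probabilities(rule, k):
    """p_i = sum of rule[d] over all decisions d that contain i (i = 1..k)."""
    return [sum(prob for d, prob in rule.items() if i in d)
            for i in range(1, k + 1)]


def nonrandomized_decision(p):
    """The subset {i : p_i = 1}; only meaningful when every p_i is 0 or 1."""
    return frozenset(i for i, p_i in enumerate(p, start=1) if p_i == 1)


# A randomized rule for one fixed rank configuration, k = 3.
randomized = {frozenset({1}): 0.5, frozenset({1, 2}): 0.25, frozenset({2, 3}): 0.25}
p = inclusion_probabilities(randomized, 3)
print(p)  # [0.75, 0.5, 0.25]

# A non-randomized rule puts all its mass on a single decision.
pure = {frozenset({1, 3}): 1.0}
d = nonrandomized_decision(inclusion_probabilities(pure, 3))
print(sorted(d))  # [1, 3]
```

In the non-randomized case the decision is recovered from the $p_i$ alone, which is the statement made above; for the randomized rule the three numbers in `p` do not identify the table uniquely.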

Let the distribution of $X_{ij}$ be given by a density function
$f(x,\theta_i)$ from a one-parametric family with the $\theta_i$'s
belonging to some interval $\Theta$, which, without loss of generality,
can be assumed to contain 0. Let
$\Omega=\{\theta\mid\theta=(\theta_1,\ldots,\theta_k)\}$. Furthermore, let
the family $f(x,\theta)$ have the following properties:

Condition A. (i) $f(x,\theta)$ is absolutely continuous in $\theta$ for
almost every x;

(ii) the limit

(1)  $$\dot f(x,0)=\lim_{\theta\to 0}\frac{1}{\theta}\,[f(x,\theta)-f(x,0)]$$

exists for almost every x;

(iii)

(2)  $$\lim_{\theta\to 0}\int_{-\infty}^{\infty}|\dot f(x,\theta)|\,dx=\int_{-\infty}^{\infty}|\dot f(x,0)|\,dx<\infty$$

holds, with $\dot f(x,\theta)$ denoting the partial derivative with respect
to $\theta$. Note that the existence of $\dot f(x,\theta)$ for almost every
$\theta$ is ensured at every point x such that $f(x,\theta)$ is absolutely
continuous in $\theta$. This, however, does not make condition (ii)
superfluous.

We know that if a density f(x) is absolutely continuous and satisfies

$$\int_{-\infty}^{\infty}|f'(x)|\,dx<\infty,$$

then the family $f(x,\theta)=f(x-\theta)$ satisfies the conditions (i),
(ii), and (iii) (see [3]), where $f'(x)=df(x)/dx$. And if a density f(x)
is absolutely continuous and satisfies

$$\int_{-\infty}^{\infty}|x f'(x)|\,dx<\infty,$$

then the family $f(x,\theta)=e^{-\theta}f[(x-\mu)e^{-\theta}]$ also
satisfies the conditions (i), (ii), and (iii) (see [3], p. 73).

Our goal is to construct a selection rule $\delta$ based on ranks
($\delta$ is conditional on an observed rank configuration $\Delta$) such
that

(3)  $\inf_{\theta\in\Omega_0}P_\theta(CS|\delta,\Delta)=P^*$, where
$\Omega_0=\{\theta\mid\theta_1=\cdots=\theta_k\}$, holds and

(4)  $P_\theta(CS|\delta,\Delta)$ is as large as possible for all $\theta$
in a neighborhood of $\theta_0\in\Omega_0$.

Since in $\Omega_0=\{\theta\mid\theta_1=\cdots=\theta_k\}$ the distribution
of the ranks does not depend on the underlying distribution of the
$X_{ij}$'s, $P_\theta(CS|\delta,\Delta)$ is constant for
$\theta\in\Omega_0$. Hence, it suffices to choose any point in $\Omega_0$
to satisfy (3). Without any loss of generality, we therefore assume
$\theta_0=(0,0,\ldots,0)$.

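The probability that rank configuration $\Delta$ is observed under
$\theta$ is

(5)  $$\begin{aligned}
P_\theta(\Delta)
&=\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\prod_{i=1}^{N}f(x_i,\theta_{\Delta_i})\,dx_1\cdots dx_N\\
&=\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\prod_{i=1}^{N}f(x_i,0)\,dx_1\cdots dx_N\\
&\quad+\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\Big[\prod_{i=1}^{N}f(x_i,\theta_{\Delta_i})-\prod_{i=1}^{N}f(x_i,0)\Big]dx_1\cdots dx_N\\
&=\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\prod_{i=1}^{N}f(x_i,0)\,dx_1\cdots dx_N\\
&\quad+\sum_{i=1}^{k}\theta_i\sum_{j:\Delta_j=i}\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\frac{f(x_j,\theta_i)-f(x_j,0)}{\theta_i}\prod_{\ell=1}^{j-1}f(x_\ell,0)\prod_{\ell=j+1}^{N}f(x_\ell,\theta_{\Delta_\ell})\,dx_1\cdots dx_N\\
&=P_{\theta_0}(\Delta)+\sum_{i=1}^{k}\theta_iA_i(\Delta,\theta),
\end{aligned}$$

where $\prod_{i=1}^{m}f(x_i,0)=1$ for $m=0$,

$$P_{\theta_0}(\Delta)=\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\prod_{i=1}^{N}f(x_i,0)\,dx_1\cdots dx_N,$$

and

$$A_i(\Delta,\theta)=\sum_{j:\Delta_j=i}\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\frac{f(x_j,\theta_i)-f(x_j,0)}{\theta_i}\prod_{\ell=1}^{j-1}f(x_\ell,0)\prod_{\ell=j+1}^{N}f(x_\ell,\theta_{\Delta_\ell})\,dx_1\cdots dx_N,\quad 1\le i\le k.$$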
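The expansion in (5) rests on a telescoping identity for a difference of two products: the jth summand swaps the jth factor while using the "null" factors to its left and the "alternative" factors to its right. The sketch below checks that identity with exact rational arithmetic; the function name `telescoped_difference` and the numbers are assumptions chosen only for illustration.

```python
from fractions import Fraction
from math import prod


def telescoped_difference(a, b):
    """Sum over j of (a_j - b_j) * prod(b_l for l < j) * prod(a_l for l > j)."""
    return sum((a[j] - b[j]) * prod(b[:j]) * prod(a[j + 1:])
               for j in range(len(a)))


# a plays the role of the factors f(x_l, theta), b that of f(x_l, 0).
a = [Fraction(1, 2), Fraction(2, 3), Fraction(3, 4)]
b = [Fraction(1, 3), Fraction(1, 5), Fraction(2, 7)]

lhs = prod(a) - prod(b)
rhs = telescoped_difference(a, b)
print(lhs == rhs)  # True
```

The empty products that occur for the first and last summands evaluate to 1, matching the convention stated for $m=0$.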

Let G denote the group of permutations g of the integers $1,\ldots,k$:

(6)  $g(1,2,\ldots,k)=(g1,g2,\ldots,gk)$.

Let h be the inverse permutation of g, $h=g^{-1}$, and define

(7)  $g(x_1,\ldots,x_k)=(x_{h1},\ldots,x_{hk})$

and, for $d\in\mathcal{D}$, $gd=\{i\mid hi\in d\}$. Also, for any
$\Delta\in\mathcal{C}$, let $\bar g$ be defined as follows:

(8)  $\bar g\Delta=(g\Delta_1,\ldots,g\Delta_N)$.

$\bar g$ is thus induced by g. Let $\bar G$ be the group $\{\bar g\}$, and
let $G(i,j)$ be the following subset of G:

(9)  $G(i,j)=\{g\in G\mid gi=j\}$.


Definition 4. A selection rule $\delta$ is invariant under permutation if
and only if

(10)  $\delta(\bar g\Delta,gd)=\delta(\Delta,d)$ for all
$\Delta\in\mathcal{C}$, $d\in\mathcal{D}$, $g\in G$, $\bar g\in\bar G$.


Assume that $\Pi_i$ is the best population; then

(11)  $P_\theta(CS|\delta,\Delta)=E_\theta\sum_{d\ni i}\delta(\Delta,d)=E_\theta\,p_i(\Delta)$.

From the modified definition (11), it follows that a subset selection rule
R is invariant under permutation if and only if

(12)  $(p_1(\bar g\Delta),\ldots,p_k(\bar g\Delta))=g(p_1(\Delta),\ldots,p_k(\Delta))$.
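The invariance condition (12) can be checked by brute force for a concrete rule. The sketch below uses a non-randomized rank-sum rule purely as an example; the helper names (`inverse`, `act_on_configuration`, `rank_sum_rule`, `is_invariant`), the cutoff value and the configuration are assumptions for illustration. Permutations are stored as tuples with `g[j-1]` equal to $gj$.

```python
from itertools import permutations


def inverse(g):
    """Inverse permutation h of g, both as 1-based label tuples."""
    h = [0] * len(g)
    for j, gj in enumerate(g, start=1):
        h[gj - 1] = j
    return tuple(h)


def act_on_configuration(g, delta):
    """The induced map on configurations: replace each label Delta_i by g(Delta_i)."""
    return tuple(g[label - 1] for label in delta)


def rank_sum_rule(delta, k, c):
    """p_i = 1 if the rank sum of population i is at least c, else 0."""
    sums = [0] * k
    for rank, label in enumerate(delta, start=1):
        sums[label - 1] += rank
    return tuple(1 if s >= c else 0 for s in sums)


def is_invariant(delta, k, c, g):
    """Check p_i(g-bar Delta) = p_{h(i)}(Delta) for every i, as in (12) with (7)."""
    h = inverse(g)
    p = rank_sum_rule(delta, k, c)
    p_moved = rank_sum_rule(act_on_configuration(g, delta), k, c)
    return p_moved == tuple(p[h[i] - 1] for i in range(k))


delta = (3, 1, 2, 1, 2, 3)
checks = [is_invariant(delta, 3, 7, g) for g in permutations((1, 2, 3))]
print(all(checks))  # True
```

Relabelling the populations by g moves the rank sum of population $hi$ to label i, which is why the right-hand side picks out `p[h[i] - 1]`.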
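$$\begin{aligned}
&=\sum_{j:\Delta_j=i}\int_{-\infty}^{\infty}\Big|\frac{1}{\theta_i}\int_0^{\theta_i}\dot f(x_j,\theta)\,d\theta\Big|\,dx_j\\
&\le\sum_{j:\Delta_j=i}\frac{1}{\theta_i}\int_{-\infty}^{\infty}\int_0^{\theta_i}|\dot f(x_j,\theta)|\,d\theta\,dx_j\\
&=\sum_{j:\Delta_j=i}\frac{1}{\theta_i}\int_0^{\theta_i}\int_{-\infty}^{\infty}|\dot f(x_j,\theta)|\,dx_j\,d\theta,
\end{aligned}$$

and a similar result can be obtained for $\theta_i<0$.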

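Hence

$$\limsup_{\theta\to 0}\sum_{j:\Delta_j=i}\int\cdots\int\Big|\frac{f(x_j,\theta_i)-f(x_j,0)}{\theta_i}\prod_{\ell=1}^{j-1}f(x_\ell,0)\prod_{\ell=j+1}^{N}f(x_\ell,\theta_{\Delta_\ell})\Big|\,dx_1\cdots dx_N\le\sum_{j:\Delta_j=i}\int_{-\infty}^{\infty}|\dot f(x_j,0)|\,dx_j.$$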


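By the Dominated Convergence Theorem, we have

$$\begin{aligned}
\lim_{\theta\to 0}A_i(\Delta,\theta)
&=\lim_{\theta\to 0}\sum_{j:\Delta_j=i}\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\frac{f(x_j,\theta_i)-f(x_j,0)}{\theta_i}\prod_{\ell=1}^{j-1}f(x_\ell,0)\prod_{\ell=j+1}^{N}f(x_\ell,\theta_{\Delta_\ell})\,dx_1\cdots dx_N\\
&=\sum_{j:\Delta_j=i}\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\dot f(x_j,0)\prod_{\ell=1,\ \ell\ne j}^{N}f(x_\ell,0)\,dx_1\cdots dx_N.
\end{aligned}$$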

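Now, there exists an $\varepsilon>0$ such that, for
$0<|\theta_i|<\varepsilon$ for all i, $1\le i\le k$, $A_i(\Delta,\theta)$
is approximately equal to

$$\sum_{j:\Delta_j=i}\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\dot f(x_j,0)\prod_{\ell=1,\ \ell\ne j}^{N}f(x_\ell,0)\,dx_1\cdots dx_N=\sum_{j:\Delta_j=i}B_j=A_i(\Delta),\quad 1\le i\le k,$$

where

$$B_j=\int_{-\infty}^{\infty}\int_{-\infty}^{x_N}\cdots\int_{-\infty}^{x_2}\dot f(x_j,0)\prod_{\ell=1,\ \ell\ne j}^{N}f(x_\ell,0)\,dx_1\cdots dx_N.$$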


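We have

$$\begin{aligned}
\sum_{g\in G(k,k)}\sum_{i=1}^{k}\theta_{hi}A_i(\Delta)
&=\sum_{i=1}^{k}A_i(\Delta)\sum_{g\in G(k,k)}\theta_{hi}\\
&=(k-2)!\sum_{\ell=1}^{k-1}\theta_\ell\sum_{i=1}^{k-1}A_i(\Delta)+(k-1)!\,\theta_kA_k(\Delta)\\
&=(k-2)!\,\{(U-\theta_k)V+(k\theta_k-U)A_k(\Delta)\},
\end{aligned}$$

where

$$U=\sum_{i=1}^{k}\theta_i\quad\text{and}\quad V=\sum_{i=1}^{k}A_i(\Delta)=\sum_{i=1}^{nk}B_i,\ \text{independent of }\Delta.$$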
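The closed form for the sum over $G(k,k)$ can be verified numerically by enumerating the permutations that fix k. The sketch below does this with integer inputs so the comparison is exact; the function names and the test values are assumptions for illustration, and lists are 0-based so that index $k-1$ stands for population k.

```python
from itertools import permutations
from math import factorial


def sum_over_G_kk(theta, A):
    """Sum over permutations g with g(k) = k of sum_i theta_{h(i)} * A_i, h = g^{-1}."""
    k = len(theta)
    total = 0
    for g in permutations(range(k)):
        if g[k - 1] != k - 1:       # keep only g in G(k, k)
            continue
        h = [0] * k                 # inverse permutation of g
        for j, gj in enumerate(g):
            h[gj] = j
        total += sum(theta[h[i]] * A[i] for i in range(k))
    return total


def closed_form(theta, A):
    """(k-2)! * {(U - theta_k) V + (k theta_k - U) A_k}; requires k >= 2."""
    k = len(theta)
    U, V = sum(theta), sum(A)
    return factorial(k - 2) * ((U - theta[-1]) * V + (k * theta[-1] - U) * A[-1])


theta = [1, 2, 3, 5]    # theta_k = 5 is the largest, as assumed in the text
A = [4, -1, 2, 7]
print(sum_over_G_kk(theta, A), closed_form(theta, A))  # 270 270
```

With these inputs $k\theta_k-U=9\ge 0$, so the expression increases with $A_k(\Delta)$ when V is held fixed.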


V is zero if f is absolutely continuous and satisfies
$\int_{-\infty}^{\infty}|f'(x)|\,dx<\infty$ ([3], p. 66). Since
$\theta_k\ge\theta_i$, $i=1,\ldots,k-1$, it follows that
$k\theta_k-U\ge 0$. Hence
$\sum_{g\in G(k,k)}\sum_{i=1}^{k}\theta_{hi}A_i(\Delta)$ is a nondecreasing
function in $A_k(\Delta)$; thus we have proved the following result.


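Theorem. If $f(x,\theta)$ satisfies Condition A, then for any i
($1\le i\le k$), the rule

$$p_i(\Delta)=\begin{cases}1 & \text{if } A_i(\Delta)>c,\\ \rho & \text{if } A_i(\Delta)=c,\\ 0 & \text{if } A_i(\Delta)<c\end{cases}$$

satisfies $\inf_{\theta\in\Omega_0}P_\theta(CS|\delta,\Delta)\ge P^*$ such
that $P_\theta(CS|\delta,\Delta)$ is as large as possible in the
neighborhood $0<|\theta_i|<\varepsilon$, $1\le i\le k$, for given
$\varepsilon>0$. The constants $\rho$ and c are determined by

(12)  $$\sum_{A_i(\Delta)>c}P_{\theta_0}(\Delta)+\rho\sum_{A_i(\Delta)=c}P_{\theta_0}(\Delta)=P^*.$$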
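For small k and n the constants c and $\rho$ can be found by direct enumeration, since under $\theta_1=\cdots=\theta_k$ the n pooled-rank positions occupied by a given population form a uniformly random n-subset of $\{1,\ldots,N\}$. The sketch below assumes the scores $B_1,\ldots,B_N$ are supplied as a list; the names `null_distribution` and `cutoff` and the example scores are illustrative assumptions, not the authors' algorithm.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def null_distribution(scores, n):
    """Exact distribution of A_i = sum of scores at the n positions of
    population i when all rank configurations are equally likely."""
    counts = Counter(sum(scores[j] for j in pos)
                     for pos in combinations(range(len(scores)), n))
    total = sum(counts.values())
    return {a: Fraction(c, total) for a, c in counts.items()}


def cutoff(scores, n, p_star):
    """Return (c, rho) with P(A > c) + rho * P(A = c) = p_star, 0 < rho <= 1."""
    dist = null_distribution(scores, n)
    upper = Fraction(0)                      # P(A > a) for the current a
    for a in sorted(dist, reverse=True):
        if upper + dist[a] >= p_star:
            return a, (p_star - upper) / dist[a]
        upper += dist[a]
    raise ValueError("p_star must lie in (0, 1]")


# k = 2, n = 2, scores equal to the ranks 1..4 (rank-sum statistic).
c, rho = cutoff([1, 2, 3, 4], 2, Fraction(3, 4))
print(c, rho)  # 4 1/2
```

In this example $P(A>4)=2/3$ and $P(A=4)=1/6$, so $2/3+\tfrac12\cdot\tfrac16=3/4$ as required. Full enumeration grows combinatorially and is only practical for small N.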


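Note that this locally optimal rule is based on weighted rank sums
using the scores

$$B_i=\frac{1}{(i-1)!\,(N-i)!}\int_0^1u^{i-1}(1-u)^{N-i}\varphi(u,f)\,du,$$

where

$$\varphi(u,f)=\frac{\dot f(F^{-1}(u,0),0)}{f(F^{-1}(u,0),0)}.$$

The factor $1/[(i-1)!\,(N-i)!]$ arises from integrating out the $i-1$
coordinates below $x_i$ and the $N-i$ coordinates above it in the
definition of $B_i$, followed by the substitution $u=F(x_i,0)$.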


Remark: (1) In problems concerning scale parameters, we use the condition
$\int|xf'(x)|\,dx<\infty$ in place of $\int|f'(x)|\,dx<\infty$ to obtain
V = 0.

(2) If the assumption $\theta_0=(0,\ldots,0)$ is replaced by the more
general one $\theta_0=(\theta,\ldots,\theta)$, then $\varphi(u,f)$ is
replaced by

(15)  $$\varphi(u,f,\theta)=\frac{\dot f(F^{-1}(u,\theta),\theta)}{f(F^{-1}(u,\theta),\theta)},$$

which in general depends on $\theta$. However, it is independent of
$\theta$ if $\theta$ is a location or scale parameter.

Nagel [5] has shown that the rules of this type are just provided
that the $B_i$'s are non-decreasing in i, which for location parameters is
true if and only if f(x) is strongly unimodal, i.e. if $-\log f(x)$
is a convex function ([3], p. 20). It follows from Nagel [5] that

$$\inf_{\theta\in\Omega}P_\theta(CS|\delta,\Delta)=\inf_{\theta\in\Omega_0}P_\theta(CS|\delta,\Delta)$$

for a just selection rule $\delta$. If $f(x,\theta)$ is the normal density
with mean $\theta$ and variance 1, then $\varphi(u,f)=\Phi^{-1}(u)$, where
$\Phi$ is the cumulative distribution function of the standard normal
random variable. Thus, the scores can be evaluated as

$$B_i=\frac{1}{(i-1)!\,(N-i)!}\int_0^1u^{i-1}(1-u)^{N-i}\Phi^{-1}(u)\,du.$$
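The normal scores have no elementary closed form, but they are easy to approximate. The sketch below uses a plain midpoint rule on $(0,1)$ together with `statistics.NormalDist().inv_cdf` for $\Phi^{-1}$; it assumes the normalizing factor $1/[(i-1)!\,(N-i)!]$ that results from integrating out the other coordinates, and the function name `normal_score` and the step count are illustrative choices rather than anything prescribed by the text.

```python
from math import factorial, pi, sqrt
from statistics import NormalDist


def normal_score(i, N, steps=20000):
    """Midpoint-rule approximation of B_i with phi(u) = Phi^{-1}(u)."""
    inv_cdf = NormalDist().inv_cdf
    h = 1.0 / steps
    total = 0.0
    for m in range(steps):
        u = (m + 0.5) * h           # midpoints avoid the endpoints 0 and 1
        total += u ** (i - 1) * (1.0 - u) ** (N - i) * inv_cdf(u)
    return total * h / (factorial(i - 1) * factorial(N - i))


scores = [normal_score(i, 5) for i in range(1, 6)]
print(all(x < y for x, y in zip(scores, scores[1:])))  # True: increasing in i

# Sanity check for N = 2: N! * B_2 is the mean of the larger of two
# standard normal variables, which equals 1 / sqrt(pi).
print(abs(2 * normal_score(2, 2) - 1 / sqrt(pi)) < 1e-3)  # True
```

The increasing scores are consistent with the normal density being strongly unimodal.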

If f has the logistic density

$$f(x,\theta)=\frac{e^{-(x-\theta)}}{[1+e^{-(x-\theta)}]^{2}},$$

then $\varphi(u,f)=2u-1$, which leads to equally spaced scores
$B_i=a+ib$, where the actual values of a and $b>0$ are irrelevant. Hence
the rule in [1] and [4],

$$R_3:\ \text{Select } \Pi_i \text{ iff } \sum_{j=1}^{n}R_{ij}\ge c,$$

is locally optimal on the respective P* level if the underlying
distributions are logistic with location parameter $\theta$. Nagel [5] has
discussed a different type of optimality of $R_3$.
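The equal spacing of the logistic scores can be confirmed exactly, using the elementary integral $\int_0^1u^a(1-u)^b\,du=a!\,b!/(a+b+1)!$. The sketch below assumes the normalizing factor $1/[(i-1)!\,(N-i)!]$ obtained by integrating out the other coordinates; the names `beta_integral` and `logistic_score` are illustrative.

```python
from fractions import Fraction
from math import factorial


def beta_integral(a, b):
    """Exact integral of u**a * (1 - u)**b over [0, 1], integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def logistic_score(i, N):
    """B_i for phi(u) = 2u - 1, including the factor 1/((i-1)! (N-i)!)."""
    integral = 2 * beta_integral(i, N - i) - beta_integral(i - 1, N - i)
    return integral / (factorial(i - 1) * factorial(N - i))


N = 6
scores = [logistic_score(i, N) for i in range(1, N + 1)]
gaps = {y - x for x, y in zip(scores, scores[1:])}
print(gaps == {Fraction(2, factorial(N + 1))})  # True: a single common gap
```

Because $B_i=a+ib$ with $b>0$, the statistic $A_i(\Delta)=\sum_{j:\Delta_j=i}B_j$ equals $na+b\sum_{j=1}^{n}R_{ij}$, so comparing it with a cutoff is the same as comparing the rank sum of population i with a cutoff.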

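If $f(x_{ij},\theta_i)=f(x_{ij}-\theta_ic_{ij})$, $1\le i\le k$,
$j=1,\ldots,n$, the regression in location case, then we have for any i,

$$A_i(\Delta)=\sum_{j:\Delta_j=i}c_{ij}B_j.$$

The procedure to select the population associated with the largest
growth rate among the $\theta_i$'s is as follows: for any i,

$$p_i(\Delta)=\begin{cases}1 & \text{if } A_i(\Delta)>c,\\ \rho & \text{if } A_i(\Delta)=c,\\ 0 & \text{if } A_i(\Delta)<c.\end{cases}$$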




A related  problem  has  been  considered  by  Gupta  and  Huang  [2]  for  the 
largest  slope  with  unknown  initial  weight  for  nonparametric  densities. 